home
***
CD-ROM
|
disk
|
FTP
|
other
***
search
/
Amiga Format CD 44
/
Amiga Format CD44 (1999-08-26)(Future Publishing)(GB)(Track 1 of 3)[!][issue 1999-10].iso
/
-in_the_mag-
/
basics
/
amos
/
optim_tips.lha
/
AMOS_Optimisations.txt
Wrap
Text File
|
1997-04-09
|
21KB
|
661 lines
>> AMOS Code Optimisation Tips <<
-----------------------------
Aminet - dev/amos/Optim_Tips.lha
Compiled by:
Ben Wyatt bwyatt@paston.co.uk
Contributors:
Garfield Benjamin gbenjam@ix.netcom.com
Ben Wyatt bwyatt@paston.co.uk
Paul Hickman ph@doc.ic.ac.uk
Tim Lewis T.Lewis@bton.ac.uk
Petri Hakkinen mystic@tlti.tokem.fi
Mark de Jong JP.de.Jong@inter.nl.net
Paul Heald u5o83@cc.keele.ac.uk
Here are a few optimisation tips for those of you trying to squeeze that
little bit more speed out of code. With the follow examples, the speed
increase may not be noticeable, but they will all make it that little bit
faster.
Important: Do not go through your entire AMOS program collection, converting
all your code so it's slightly faster. Only alter code which you think is
/too/ slow. For example, don't change your 3D renderer just to cut down the
rendering time 1/50 of a second or something silly. And similarly, don't
bother fiddling with the main loop of a game when it already runs in a Vbl!
A lot of these optimisation tips produce some messy code so often it isn't
worth changing things!
Note: Some of these speed increases only work when compiled, mainly with
the AMOSPRO Compiler. A lot of them are only tested this way. It must also
be stated that these tips may be incorrect for some CPUs - they have mainly
been tested on an 68020, but caches and different instruction cycle timings
could alter the results. Some sort of chart may be available in future
releases of this document - the best idea is to experiment on your system.
Each tip is marked by a *, followed by the amount of speed increase, shown
by a S, M, L or V. These stand for Small, Medium, Large and Varies.
------------------------------------------------------------------------------
>> COMPARISONS <<
-------------
*S Don't use Sgn(n), use n<0 and n>0 instead.
Eg.
Don't use:
If Sgn(n)=-1 Then blah blah
If Sgn(n)=1 Then blah de blah
Use this instead:
If n<0 Then blah blah
If n>0 Then blah de blah
*M Use Abs(n)<v instead of n>-v and n<v.
Eg.
Don't use:
If n>-5 and n<5 Then blah
Use this instead:
If Abs(n)<5 Then blah
*S Use < and > instead of <= and >= whenever possible. They are slightly
faster. Strange, but true.
Eg. When comparing immediate-values
Don't use:
If n<=10 Then blah
Use this instead:
If n<11 Then blah
But, when using variables, don't use this kind of thing:
A=10
If B>A-1 Then etc
instead of:
A=10
If B>=A Then etc
*M Don't do stuff like If n=True, If n=False, or If n<>0.
Eg.
Don't use:
If a=True or b=False or c<>0 Then blah blah
Use this instead:
If a or Not(b) or c Then blah blah
But ONLY do this if you are sure b is a boolean value. "Not" simply
bitflips an integer so Not -1 is 0, but Not 2 is $FFFFFFFE.
AMOSPro is rather buggy with the Not() command, so you may have to
experiment a bit.
Also, using =False instead of =0 is faster. Another trick you should be
aware of is that you can use the optimised version of <>0 where you
would normally use >0 (or <0) if the routine only returns values >=0.
Eg.
Don't use:
If Instr(a$,"Something")>0 Then blah
Do this instead:
If Instr(a$,"Something") Then blah
------------------------------------------------------------------------------
>> ARRAY ACCESS <<
--------------
*M For your most used array of 8 (or less) elements, use Dreg() instead of
it (it can still hold the same range of values). Note: The Dreg() array
isn't actually using the data registers; it's an AMOS internal data
structure. However, it's still faster than a standard array.
Eg.
Don't use:
Dim n(7)
For a=0 to 7:Print n(a):Next a
Use this instead:
For a=0 To 7:Print Dreg(a):Next a
*S Use single dimensional arrays instead of two dimensional ones if
appropriate.
Eg.
Don't use:
Dim A(2,100)
Use this instead:
Dim A0(100),A1(100),A2(100)
------------------------------------------------------------------------------
>> MATH OPERATIONS <<
-----------------
*L Whenever possible use powers of 2 when multiplying or dividing. The
AMOSPro Compiler will automatically optimise these to lsl and lsr.
DON'T use the lsl or lsr commands in extensions, as they are over
TWICE as slow!
Eg.
Don't use:
n=a*30
Use this instead (if suitable):
n=a*32
As well as this, you can also do:
n=x*160 -> n=x*(128+32) -> n=x*128+x*32 (tadaa)
Also n=x*192 -> n=x*128+x*64 etc.
But this kind of thing is not faster:
a=32 : b=7 : c=a*b
*L Never use floating points. Use integers multiplied by a power of 2. You
can actually decide what sort of precision you want in your decimal places
and then multiply them out. For example, if you decide on about 2 point
accuracy, you can multiply all your values by 128 (2^7, close enough to
100) and then do the calculations. When you have the results simply divide
by 128 to get the required result.
Eg.
Don't use:
x#=x#*1.5
Plot x#,100
Use this instead:
x=x*(3*128) : Rem Same as x=x*(1.5*256)
Rem 3*128 is calculated at compilation time
Plot X/256,100
*L Predefine as much as possible. Especially useful for Sin and Cos etc.
Eg.
Don't use:
Repeat
If Sqr(x*x+y*y)=10 Then blah blah blah
Until Something
Use this instead:
' This code should be placed at the beginning of your code.
xmax=Maximum value of x : ymax=Maximum value of y
Dim QUICKSQR(xmax,ymax)
For x=0 to xmax
For y=0 to ymax
QUICKSQR(x,y)=SQR(x*x+y*y)
Next y
Next x
' [Other code]
Repeat
If QUICKSQR(Abs(x),Abs(y))=10 Then blah blah blah
Until Something
*S Use N=N+A instead of using Inc, Dec or Add. Although the manual claims
they are faster, when using the AMOSPro Compiler, they actually turn
out slightly slower.
Eg.
Don't use:
Inc a : Dec b : Add c,10
Use this instead:
a=a+1 : b=b-1 : c=c+10
Note: Although Add is slower in it's short form, the full version
with the base To top part is faster than the equivilant code.
This only applies to standard variables - DON'T use this with arrays!
------------------------------------------------------------------------------
>> MISCELLANEOUS <<
---------------
*L Get yourself a fast extension, such as AMCAF or Turbo Plus. This will
only speed things up where an extension command replaces an internal
command with a faster one - Easylife and AMCAF will do this for many
string operations and, AMCAF, Turbo and Powerbobs do it for graphics.
Eg.
Don't use:
Plot x,y,c
Use this instead:
Turbo Plot x,y,c (for AMCAF)
or F Plot x,y,c (for Turbo Plus)
Many extensions can be found on Aminet (FTP from wuarchive.wustl.edu
in systems/amiga/aminet/dev/amos).
*V Use assembler procedures for inner loops. Beware though, it does take
a small amount of CPU time to jump to an assembler procedure, so with
small routines, it's simply not worth it.
*S Use "Copper Off" to disable the screen while doing intense communication.
The copper then stops stealing clock cycles from the processor (Unlike
Multi No and such instructions, this really does work). Use "Copper On"
to re-enable the normal system.
*L Use Fill to clear an array to a certain value, rather than the
equivilant For / Next loop.
Eg.
Don't use:
For n=0 to 1000
A(n)=27
Next n
Use this instead:
Fill Varptr(A(0)) To Varptr(A(1000)),27
*S Use global variables to return parameters from procedures, rather than
Param.
Eg.
Don't use:
_HELLO[10,20] : pram=Param : Print pram
Procedure _HELLO[x,y]
temp=0
For n=x To y
If a(n)=0 Then temp=n : Exit
Next n
End Proc[temp]
Use this instead:
Global pram
_HELLO[10,20] : Print pram
Procedure _HELLO[x,y]
pram=0
For n=x To y
If a(n)=0 Then pram=n : Pop Proc
Next N
End Proc
However, don't do things like this, which are actually slower.
Don't do:
_HELLO[10,20] : Print A
Procedure _HELLO[X,Y]
Shared A
A=X*Y
End Proc
Do this:
_HELLO[10,20] : Print Param
Procedure _HELLO[X,Y]
End Proc[X*Y]
*L Use subroutines instead of procedures. Although they're messier, they are
several times faster! Of course, with this method, you can not pass
parameters, but all the variables you were using will still be accessible.
Eg.
Don't use:
_SOMETHING
Procedure _SOMETHING
Code
End Proc
Use this instead:
Gosub _SOMETHING
_SOMETHING:
Code
Return
------------------------------------------------------------------------------
>> TRADITIONAL OPTIMISATIONS <<
---------------------------
*V Move invariant statement outside loops (Standard optimising compiler
stuff).
Eg.
If you have something like this:
Repeat
If x<a*20 Then do something groovy
Until something
If the rest of the loop does not modify a, this can become:
temp=a*20
Repeat
If x<temp Then do something groovy
Until something
*V Unroll your Loops!! Cuts down on Stack-Access eliminating Loop-overhead
and replacing it with dedicated code. Don't go TOO overboard though.
Normally five iterations is enough to gain a bit of speed. Too many
iterations unrolled will actually lose speed on systems with CPU caches.
Eg.
Don't use:
For X=1 To 20
UPDATE_ALIEN[X]
Next X
Use this instead:
For X=1 To 16 Step 5
UPDATE_ALIEN[X]
UPDATE_ALIEN[X+1]
UPDATE_ALIEN[X+2]
UPDATE_ALIEN[X+3]
UPDATE_ALIEN[X+4]
Next X
Note: You should only carry out this optimisation on small segments of
code, such as plotting pixels, otherwise your code can get very bloated.
*M Eliminate stack access (the reason Procedures are slower than Gosubs)
Eg.
Don't use:
For X=1 To 20
Gosub MOVE_ALIEN : Rem This uses the stack 40 times
Next X
' Other code
MOVE_ALIEN:
Moves Alien X
Return
Use this instead:
Gosub MOVE_ALIENS : Rem This only uses the stack twice.
' Other code
MOVE_ALIENS:
For X=1 To 20
Move Alien X
Next X
Return
Also, try to move local variables out of procedures to make them global.
They are significantly faster to use.
------------------------------------------------------------------------------
>> GRAPHICS <<
----------
*M Use Cls col,x1,y1 To x2,y2 instead of Bar x1,y1 To x2,y2.
Note: Use R Bar in Turbo instead if possible.
*S Use Polyline to draw boxes instead of Box.
Note: Use R Box in Turbo instead if possible.
Eg.
Don't use:
Box 10,10 To 10,10
Use this instead:
Polyline 10,10 To 20,10 To 20,20 To 10,20
------------------------------------------------------------------------------
>> STRINGS <<
---------
*L Use Peek(Varptr(a$)+x-1) instead of Asc(Mid$(a$,x,1)) and
Poke Varptr(a$)+x-1,Asc(b$) instead of Mid$(a$,x,1)=b$.
Eg.
Don't use:
A=Asc(Mid$(a$,3,1))
Mid$(a$,6,1)=b$
Use this instead:
A=Peek(Varptr(a$)+2)
Poke Varptr(a$)+5,Asc(b$) : Rem Asc(b$) could be predefined
Note: This is just under three times as fast!
*M This routine is for making the string 1 character shorter. It may be
helpful in an program to replace the input command (if you'd ever need
to optimise such a routine which is unlikely).
Eg.
Don't use:
S$=Space$(1000)
For X=1 To 1000
S$=Left$(S$,Len(S$)-1)
Next X
Use this instead:
S$=Space$(1000)
POS=Varptr(S$) : Rem Put this statement inside the loop if you
Rem do any standard string operations.
For X=1 To 1000
Doke POS-2,Deek(POS)-1
Next X
Note: This routine could be rather risky, as it is done outside the AMOS
string handling system.
*S Don't use Chr$(code) in routines but replace them with a string.
Eg.
Don't use:
I=Instr(ST$,Chr$(0))
Use this instead:
C0$=Chr$(0) : Rem At the beginning of your prog
I=Instr(ST$,C0$) : Rem In the routine
*V A quasi-optimisation: Use "n=Free" at points where you don't care about
the speed, to reduce the chances of the AMOS string garbage collector
being called when you do care about the speed. This will only have an
effect when strings are used in the routine.
------------------------------------------------------------------------------
>> STYLE OPTIMISING <<
------------------
*S Use a Repeat / Until loop instead of a For / Next loop.
Eg.
Don't use:
For n=0 to 10
[code]
Next n
Use this instead:
n=0
Repeat
[code]
n=n+1
Until n=11 : Rem Must be 1 more than with the original For
*S Use "If condition Then code" instead of "If condition : code : End If",
if the code is short.
Eg.
Don't use:
If a=2
la-de-da
End If
Use this instead:
If a=2 Then la-de-da
*S Don't use Start(BANK), Screen Width or Screen Height in a routine that
must be fast. In fact, do the same with any AMOS variable that won't
change in the loop, including Joy(), JoyUp(), Fire(), etc. if you have
several references to these in the same "loop".
Eg.
Don't use:
For X=1 To 10000
PE=Peek(Start(10)+X)
Next X
Use this instead:
ST=Start(BANK)
For X=1 To 10000
PE=Peek(ST+X)
Next X
*M Use the value of expressions, rather than testing what they mean.
Eg.
Don't use:
If Jup(1) Then Y=Y-1
If Jdown(1) Then Y=Y+1
Use this instead:
Y=Y+Jup(1)-Jdown(1)
Remember: If it is True, -1 is returned, otherwise 0 is returned.
The Turbo (Plus) extension has a useful command called "Texp" which
returns specified values for True and False. Although it isn't very
fast, it produces neater code.
*M Don't use parameters in Def Fns. As functions are local to procedures,
any variables used, will be known in the function.
Eg.
Don't use:
Def Fn C(x,y)=(x*y*y+x) mod 16
For x=0 To 10
For y=0 To 10
Plot x,y,Fn C(x,y)
Next y
Next x
Use this instead:
Def Fn C=(x*y*y+x) mod 16
For x=0 To 10
For y=0 To 10
Plot x,y,Fn C
Next y
Next x
*S Use as few brackets in equations and If structures as possible.
Eg.
Don't use:
A=(B*C)-(D*E)-(F+G)
C=(A=2 or B=100)
Use this instead:
A=B*C-D*E-F-G
C=A=2 or B=100
*M The fastest way to display a map, using a few of the above tips:
' This is the slow way, that I've seen in multiple sources of games
For Y=0 To FHV
For X=0 To FWV
Paste Icon 16+X*16,16+Y*16,Peek((FPOSY+Y)*FW+Start(BLS1)+FPOSX)+1
Next X
Next Y
' Now my quick version... Note that the start(bank)
' is placed in front of the routine.
' FHV = Field Height View
' FHV = Field Width View
' FPOSX = Field POSition X (Where the player is looking in the map)
' FPOSY = Field POSition Y
POS=Start(BLS1)+FPOSX
For Y=0 To FHV
MPY=(FPOSY+Y)*FW+POS
SPY=16+Y*16
For X=0 To FWV
' Using a Turbo Plus extension command
F Paste Icon 16+X*16,SPY,Peek(MPY+X)+1
Next X
Next Y
*V Try various ways of writing code to see if you can find the quickest.
THIS is probably the SINGLE most important TIP for blazing speed.
You can fine-tune, even convert it to pure Assembler but, it may be MUCH
slower than another method you have not yet tried!!
Find that faster method!!
Something like this is a nice way to test your improvements:
Timer=0
For n=0 To 10000
[unoptimised code]
Next n
Print "Time for unoptimised code:";Timer
Timer=0
For n=0 To 10000
Optimised code
Next n
Print "Time for optimised code:";Timer
Note: This kind of test becomes quite useless if you have a CPU cache -
The second loop may report to be slower even if it's exactly the same.
When doing such tests, you should disable the cache if possible.
The key to gaining the absolute BEST speed is to ask yourself WHY is this
routine so slow? WHAT is slowing it down? The excellent tips above will
probably provide you with the answer more often than not but the secret
is you have to know what to look for. Think about what the computer does
BEHIND THE SCENES.
For example, why are two-dimensional arrays slower than single-dimensional
arrays? Or, for that matter, why are even one-dimensional arrays slow?
Think of it like this... When you access an element in a single-
dimensional array the computer has to find the the Base address of the
array. Then it must find the offset to reference to element you requested.
If you had something like A=MyTable(10) the computer would process this
along these lines:
1. Find base address of array MyTable.
2. Calculate offset of element 10 by multiplying 10 X 4 (4 is size of
integer).
3. Now add offset to array base to determine the element's memory
address.
4. Finally, extract the value held in this memory location.
As you can see that's quite a bit of work (particularly in a loop) just
to do something that appears so simple on this end.
Now, for a two dimensional array, the computer has twice the work, two
multiplies to calculate to determine the address of any given two
dimensional array elements such as A=MyTable(10,10).
Like the Gosub replacing Procedure tip, this is a bit messier but,
everything has a price so try this:
MyTableBase=VarPtr(MyTable)
Now instead of accessing the array through AMOS with A=MyTable(10)
access it directly yourself like this A=Leek(MyTableBase+40). The 40 is
the offset for element 10, multiplied by the integer size, 4.
Used in a loop this access a one-dimensional array 20% faster than
the standard AMOS code (ie A=Array(element)) and is 33% faster at
accessing two-dimensional arrays.
If you don't need long-word size variables, you can substitute Deek
for Leek for a TINY bit more speed (allowing 25% faster access of
one-dimensional arrays and 40% faster access of two-dimensional arrays).
Should be better but, AMOS's Peek, Deek and Leek are kind of slow
themselves!!
------------------------------------------------------------------------------
If you find any general purpose tips, please send them to me (at
bwyatt@paston.co.uk) and I'll add them to the list!
If you want proof that these tips do work, then take a look at two of my
recent games, Borisball and Knockout 2. Apart from being tremendously good
fun, they demonstrate how the AMOS speed limit can be broken!! Their Aminet
positions are:
game/demo/BorisBall.lha
game/2play/KnockOut2.lha